[AMDGPU] Refactor insertWaveSizeFeature #154850

rampitec · 2025-08-21T21:59:57Z

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.

rampitec · 2025-08-21T22:00:20Z

[AMDGPU] More radical feature initialization refactoring #155222
[AMDGPU] Refactor insertWaveSizeFeature #154850 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

llvmbot · 2025-08-21T22:01:31Z

@llvm/pr-subscribers-clang

Author: Stanislav Mekhanoshin (rampitec)

Changes

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.

Full diff: https://github.com/llvm/llvm-project/pull/154850.diff

3 Files Affected:

(modified) clang/lib/Basic/Targets/AMDGPU.cpp (+2-2)
(modified) clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl (+2)
(modified) llvm/lib/TargetParser/TargetParser.cpp (+22-49)

diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 639e735202f26..dba1bab88c481 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -197,11 +197,11 @@ bool AMDGPUTargetInfo::initFeatureMap(
     const std::vector<std::string> &FeatureVec) const {
 
   using namespace llvm::AMDGPU;
-  fillAMDGPUFeatureMap(CPU, getTriple(), Features);
+
   if (!TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec))
     return false;
 
-  // TODO: Should move this logic into TargetParser
+    // TODO: Should move this logic into TargetParser
   auto HasError = insertWaveSizeFeature(CPU, getTriple(), Features);
   switch (HasError.first) {
   default:
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
index 4e2f7f86e8402..04de5dca3f6c0 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
@@ -1,8 +1,10 @@
 // RUN: not %clang_cc1 -triple amdgcn -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1103 -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx900 -target-feature +wavefrontsize32 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX9
+// RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1250 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX1250
 
 // CHECK: error: invalid feature combination: 'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive
 // GFX9: error: option 'wavefrontsize32' cannot be specified on this target
+// GFX1250: error: option 'wavefrontsize64' cannot be specified on this target
 
 kernel void test() {}
diff --git a/llvm/lib/TargetParser/TargetParser.cpp b/llvm/lib/TargetParser/TargetParser.cpp
index 50b97d3257540..91ac4cc71d34b 100644
--- a/llvm/lib/TargetParser/TargetParser.cpp
+++ b/llvm/lib/TargetParser/TargetParser.cpp
@@ -471,6 +471,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["setprio-inc-wg-inst"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize32"] = true;
       break;
     case GK_GFX1201:
     case GK_GFX1200:
@@ -638,6 +639,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["vmem-to-lds-load-insts"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX90A:
       Features["gfx90a-insts"] = true;
@@ -681,6 +683,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["image-insts"] = true;
       Features["s-memtime-inst"] = true;
       Features["gws"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX705:
     case GK_GFX704:
@@ -698,6 +701,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_NONE:
       break;
@@ -734,68 +738,37 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
   }
 }
 
-static bool isWave32Capable(StringRef GPU, const Triple &T) {
-  bool IsWave32Capable = false;
-  // XXX - What does the member GPU mean if device name string passed here?
-  if (T.isAMDGCN()) {
-    switch (parseArchAMDGCN(GPU)) {
-    case GK_GFX1250:
-    case GK_GFX1201:
-    case GK_GFX1200:
-    case GK_GFX1153:
-    case GK_GFX1152:
-    case GK_GFX1151:
-    case GK_GFX1150:
-    case GK_GFX1103:
-    case GK_GFX1102:
-    case GK_GFX1101:
-    case GK_GFX1100:
-    case GK_GFX1036:
-    case GK_GFX1035:
-    case GK_GFX1034:
-    case GK_GFX1033:
-    case GK_GFX1032:
-    case GK_GFX1031:
-    case GK_GFX1030:
-    case GK_GFX1012:
-    case GK_GFX1011:
-    case GK_GFX1013:
-    case GK_GFX1010:
-    case GK_GFX12_GENERIC:
-    case GK_GFX11_GENERIC:
-    case GK_GFX10_3_GENERIC:
-    case GK_GFX10_1_GENERIC:
-      IsWave32Capable = true;
-      break;
-    default:
-      break;
-    }
-  }
-  return IsWave32Capable;
-}
-
 std::pair<FeatureError, StringRef>
 AMDGPU::insertWaveSizeFeature(StringRef GPU, const Triple &T,
                               StringMap<bool> &Features) {
-  bool IsWave32Capable = isWave32Capable(GPU, T);
+  StringMap<bool> DefaultFeatures;
+  fillAMDGPUFeatureMap(GPU, T, DefaultFeatures);
+
   const bool IsNullGPU = GPU.empty();
+  const bool TargetHasWave32 = DefaultFeatures.count("wavefrontsize32");
+  const bool TargetHasWave64 = DefaultFeatures.count("wavefrontsize64");
   const bool HaveWave32 = Features.count("wavefrontsize32");
   const bool HaveWave64 = Features.count("wavefrontsize64");
   if (HaveWave32 && HaveWave64) {
     return {AMDGPU::INVALID_FEATURE_COMBINATION,
             "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive"};
   }
-  if (HaveWave32 && !IsNullGPU && !IsWave32Capable) {
+  if (HaveWave32 && !IsNullGPU && TargetHasWave64) {
     return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize32"};
   }
+  if (HaveWave64 && !IsNullGPU && TargetHasWave32) {
+    return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize64"};
+  }
   // Don't assume any wavesize with an unknown subtarget.
-  if (!IsNullGPU) {
-    // Default to wave32 if available, or wave64 if not
-    if (!HaveWave32 && !HaveWave64) {
-      StringRef DefaultWaveSizeFeature =
-          IsWave32Capable ? "wavefrontsize32" : "wavefrontsize64";
-      Features.insert(std::make_pair(DefaultWaveSizeFeature, true));
-    }
+  // Default to wave32 is target supports both.
+  if (!IsNullGPU && !HaveWave32 && !HaveWave64 && !TargetHasWave32 &&
+      !TargetHasWave64)
+    Features.insert(std::make_pair("wavefrontsize32", true));
+
+  for (const auto &Entry : DefaultFeatures) {
+    if (!Features.count(Entry.getKey()))
+      Features[Entry.getKey()] = Entry.getValue();
   }
+
   return {NO_ERROR, StringRef()};
 }

llvmbot · 2025-08-21T22:01:32Z

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.

Full diff: https://github.com/llvm/llvm-project/pull/154850.diff

3 Files Affected:

(modified) clang/lib/Basic/Targets/AMDGPU.cpp (+2-2)
(modified) clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl (+2)
(modified) llvm/lib/TargetParser/TargetParser.cpp (+22-49)

diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 639e735202f26..dba1bab88c481 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -197,11 +197,11 @@ bool AMDGPUTargetInfo::initFeatureMap(
     const std::vector<std::string> &FeatureVec) const {
 
   using namespace llvm::AMDGPU;
-  fillAMDGPUFeatureMap(CPU, getTriple(), Features);
+
   if (!TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec))
     return false;
 
-  // TODO: Should move this logic into TargetParser
+    // TODO: Should move this logic into TargetParser
   auto HasError = insertWaveSizeFeature(CPU, getTriple(), Features);
   switch (HasError.first) {
   default:
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
index 4e2f7f86e8402..04de5dca3f6c0 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
@@ -1,8 +1,10 @@
 // RUN: not %clang_cc1 -triple amdgcn -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1103 -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx900 -target-feature +wavefrontsize32 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX9
+// RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1250 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX1250
 
 // CHECK: error: invalid feature combination: 'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive
 // GFX9: error: option 'wavefrontsize32' cannot be specified on this target
+// GFX1250: error: option 'wavefrontsize64' cannot be specified on this target
 
 kernel void test() {}
diff --git a/llvm/lib/TargetParser/TargetParser.cpp b/llvm/lib/TargetParser/TargetParser.cpp
index 50b97d3257540..91ac4cc71d34b 100644
--- a/llvm/lib/TargetParser/TargetParser.cpp
+++ b/llvm/lib/TargetParser/TargetParser.cpp
@@ -471,6 +471,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["setprio-inc-wg-inst"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize32"] = true;
       break;
     case GK_GFX1201:
     case GK_GFX1200:
@@ -638,6 +639,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["vmem-to-lds-load-insts"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX90A:
       Features["gfx90a-insts"] = true;
@@ -681,6 +683,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["image-insts"] = true;
       Features["s-memtime-inst"] = true;
       Features["gws"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX705:
     case GK_GFX704:
@@ -698,6 +701,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_NONE:
       break;
@@ -734,68 +738,37 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
   }
 }
 
-static bool isWave32Capable(StringRef GPU, const Triple &T) {
-  bool IsWave32Capable = false;
-  // XXX - What does the member GPU mean if device name string passed here?
-  if (T.isAMDGCN()) {
-    switch (parseArchAMDGCN(GPU)) {
-    case GK_GFX1250:
-    case GK_GFX1201:
-    case GK_GFX1200:
-    case GK_GFX1153:
-    case GK_GFX1152:
-    case GK_GFX1151:
-    case GK_GFX1150:
-    case GK_GFX1103:
-    case GK_GFX1102:
-    case GK_GFX1101:
-    case GK_GFX1100:
-    case GK_GFX1036:
-    case GK_GFX1035:
-    case GK_GFX1034:
-    case GK_GFX1033:
-    case GK_GFX1032:
-    case GK_GFX1031:
-    case GK_GFX1030:
-    case GK_GFX1012:
-    case GK_GFX1011:
-    case GK_GFX1013:
-    case GK_GFX1010:
-    case GK_GFX12_GENERIC:
-    case GK_GFX11_GENERIC:
-    case GK_GFX10_3_GENERIC:
-    case GK_GFX10_1_GENERIC:
-      IsWave32Capable = true;
-      break;
-    default:
-      break;
-    }
-  }
-  return IsWave32Capable;
-}
-
 std::pair<FeatureError, StringRef>
 AMDGPU::insertWaveSizeFeature(StringRef GPU, const Triple &T,
                               StringMap<bool> &Features) {
-  bool IsWave32Capable = isWave32Capable(GPU, T);
+  StringMap<bool> DefaultFeatures;
+  fillAMDGPUFeatureMap(GPU, T, DefaultFeatures);
+
   const bool IsNullGPU = GPU.empty();
+  const bool TargetHasWave32 = DefaultFeatures.count("wavefrontsize32");
+  const bool TargetHasWave64 = DefaultFeatures.count("wavefrontsize64");
   const bool HaveWave32 = Features.count("wavefrontsize32");
   const bool HaveWave64 = Features.count("wavefrontsize64");
   if (HaveWave32 && HaveWave64) {
     return {AMDGPU::INVALID_FEATURE_COMBINATION,
             "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive"};
   }
-  if (HaveWave32 && !IsNullGPU && !IsWave32Capable) {
+  if (HaveWave32 && !IsNullGPU && TargetHasWave64) {
     return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize32"};
   }
+  if (HaveWave64 && !IsNullGPU && TargetHasWave32) {
+    return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize64"};
+  }
   // Don't assume any wavesize with an unknown subtarget.
-  if (!IsNullGPU) {
-    // Default to wave32 if available, or wave64 if not
-    if (!HaveWave32 && !HaveWave64) {
-      StringRef DefaultWaveSizeFeature =
-          IsWave32Capable ? "wavefrontsize32" : "wavefrontsize64";
-      Features.insert(std::make_pair(DefaultWaveSizeFeature, true));
-    }
+  // Default to wave32 is target supports both.
+  if (!IsNullGPU && !HaveWave32 && !HaveWave64 && !TargetHasWave32 &&
+      !TargetHasWave64)
+    Features.insert(std::make_pair("wavefrontsize32", true));
+
+  for (const auto &Entry : DefaultFeatures) {
+    if (!Features.count(Entry.getKey()))
+      Features[Entry.getKey()] = Entry.getValue();
   }
+
   return {NO_ERROR, StringRef()};
 }

github-actions · 2025-08-21T22:03:23Z

✅ With the latest revision this PR passed the C/C++ code formatter.

rampitec · 2025-08-22T07:56:48Z

llvm/lib/TargetParser/TargetParser.cpp

+    Features.insert(std::make_pair("wavefrontsize32", true));
+
+  for (const auto &Entry : DefaultFeatures) {
+    if (!Features.count(Entry.getKey()))


I wish there is a method to detach entry from one map and add it to another, but I didn't find one.

rampitec · 2025-08-25T23:19:53Z

Ping. Here I did that and just that -- rely purely on features. It is somewhat spaghetti, but it was so before. Then there is a follow-up to hide the implementation details from language drivers: #155222.

yxsamliu · 2025-08-26T04:16:26Z

clang/lib/Basic/Targets/AMDGPU.cpp


  using namespace llvm::AMDGPU;
-  fillAMDGPUFeatureMap(CPU, getTriple(), Features);
+


I am a bit concerned moving fillAMDGPUFeatureMap from here to insertWaveSizeFeature. Since we are supposed to initialize the feature map with the target defaults, then override them with command line options.

How about we keep fillAMDGPUFeatureMap here and make a copy of default target features and pass it to insertWaveSizeFeature as an extra argument. In insertWaveSizeFeature, we only use the default target feature for checking target support of wave32/64.

Check the next #155222. Here I have only done what you have asked for. There I have refactored the whole thing.

If you want I can squash both into a single commit, but I thought that way it is better reviewable.

If a wavefrontsize32 or wavefrontsize64 is the only possible value insert it into feature list by default and use that value as an indication that another wavefront size is not legal.

…wavesizefeature

rampitec requested review from arsenm and yxsamliu August 21, 2025 22:00

rampitec marked this pull request as ready for review August 21, 2025 22:01

llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Aug 21, 2025

rampitec mentioned this pull request Aug 21, 2025

[AMDGPU] Error out in clang if wavefront64 is used on gfx1250 #153693

Closed

rampitec force-pushed the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch from a6ae3b4 to 9694899 Compare August 21, 2025 22:04

rampitec commented Aug 22, 2025

View reviewed changes

rampitec force-pushed the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch 2 times, most recently from ecc47cd to ca5b895 Compare August 25, 2025 08:20

rampitec mentioned this pull request Aug 25, 2025

[AMDGPU] More radical feature initialization refactoring #155222

Merged

yxsamliu reviewed Aug 26, 2025

View reviewed changes

[AMDGPU] Refactor insertWaveSizeFeature

7ffe0f9

If a wavefrontsize32 or wavefrontsize64 is the only possible value insert it into feature list by default and use that value as an indication that another wavefront size is not legal.

rampitec force-pushed the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch from c7e8319 to 7ffe0f9 Compare August 26, 2025 18:22

rampitec added 2 commits August 26, 2025 11:43

Merge branch 'main' into users/rampitec/08-21-_amdgpu_refactor_insert…

d73a04f

…wavesizefeature

Merge branch 'main' into users/rampitec/08-21-_amdgpu_refactor_insert…

686c922

…wavesizefeature

rampitec merged commit 8c6b7af into main Aug 27, 2025
9 checks passed

rampitec deleted the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch August 27, 2025 07:30

aemerson mentioned this pull request Aug 30, 2025

[NFC] [clangd] [Modules] remove dot in log #156207

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AMDGPU] Refactor insertWaveSizeFeature #154850

[AMDGPU] Refactor insertWaveSizeFeature #154850

rampitec commented Aug 21, 2025

Uh oh!

rampitec commented Aug 21, 2025 •

edited

Loading

Uh oh!

llvmbot commented Aug 21, 2025

Uh oh!

llvmbot commented Aug 21, 2025

Uh oh!

github-actions bot commented Aug 21, 2025 •

edited

Loading

Uh oh!

rampitec Aug 22, 2025

Uh oh!

rampitec commented Aug 25, 2025

Uh oh!

yxsamliu Aug 26, 2025

Uh oh!

rampitec Aug 26, 2025

Uh oh!

rampitec Aug 26, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants


		using namespace llvm::AMDGPU;
		fillAMDGPUFeatureMap(CPU, getTriple(), Features);

[AMDGPU] Refactor insertWaveSizeFeature #154850

[AMDGPU] Refactor insertWaveSizeFeature #154850

Conversation

rampitec commented Aug 21, 2025

Uh oh!

rampitec commented Aug 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Aug 21, 2025

Uh oh!

llvmbot commented Aug 21, 2025

Uh oh!

github-actions bot commented Aug 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rampitec Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

rampitec commented Aug 25, 2025

Uh oh!

yxsamliu Aug 26, 2025

Choose a reason for hiding this comment

Uh oh!

rampitec Aug 26, 2025

Choose a reason for hiding this comment

Uh oh!

rampitec Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

rampitec commented Aug 21, 2025 •

edited

Loading

github-actions bot commented Aug 21, 2025 •

edited

Loading

rampitec Aug 26, 2025 •

edited

Loading